Instrumentation Run Dataparallel C Architectural Linearization

Authors

  • Mark J. Clement
  • Michael J. Quinn
Abstract

Recent advances in the power of parallel computers have made them attractive for solving large computational problems. Scalable parallel programs are particularly well suited to Massively Parallel Processing (MPP) machines since the number of computations can be increased to match the available number of processors. Performance tuning can be particularly difficult for these applications since it must often be performed with a smaller problem size than that targeted for eventual execution. This research develops a performance prediction methodology that addresses this problem through symbolic analysis of program source code. Algebraic manipulations can then be performed on the resulting analytical model to determine performance for scaled-up applications on different hardware architectures.

Similar resources

Dynamic Instrumentation and Optimization for GPU Applications

Parallel architectures like GPUs are a tantalizing compute fabric for performance-hungry developers. While GPUs enable order-of-magnitude performance increases in many dataparallel application domains, writing efficient codes that can actually manifest those increases is a non-trivial endeavor, typically requiring developers to exercise specialized architectural features exposed directly in the...

HP Caliper : A Framework for Performance Analysis

You perform statistical sampling by taking periodic snapshots of a program’s state. Statistical sampling is nonintrusive—unlike binary instrumentation, statistical sampling doesn’t add any lines of code to the application being tested—but the computing community generally regards this technique as imprecise. It imposes low overhead on a program’s runtime performance and can be used for time-cri...

Performance Analysis of Large-Scale OpenMP and Hybrid MPI/OpenMP Applications with Vampir NG

This paper presents a tool setup for comprehensive event-based performance analysis of large-scale OpenMP and hybrid OpenMP/MPI applications. The KOJAK framework is used for portable code instrumentation and automatic analysis, while the new VampirNG infrastructure serves as a generic visualization engine for both OpenMP and MPI performance properties. The tools share the same database which enab...

Implementation of the Parallel Superposition in Bulk-Synchronous Parallel ML

Bulk-Synchronous Parallel ML (BSML) is a functional data-parallel language for coding Bulk-Synchronous Parallel (BSP) algorithms. It allows execution time to be estimated and avoids deadlocks and nondeterminism. This paper presents the implementation of a new primitive for BSML which can express divide-and-conquer algorithms.

Fundamental issues in designing data-parallel dataflow computers

This paper analyses the fundamental primitives in the data-parallel computational model and proposes architectural solutions to these within the framework of a data-parallel dataflow computer. The collective behaviour of this paradigm enables a novel caching mechanism to be used in conjunction with an ETS matching store. It is also shown how collective behaviour may be exploited in opt...

Publication date: 1995